To date, the best-performing blind super-resolution (SR) techniques follow one of two paradigms: A) generate and train a standard SR network on synthetic low-resolution/high-resolution (LR-HR) pairs or B) attempt to predict the degradations an LR image has suffered and use these to inform a customised SR network. Despite significant progress, subscribers to the former miss out on useful degradation information that could be used to improve the SR process. On the other hand, followers of the latter rely on weaker SR networks, which are significantly outperformed by the latest architectural advancements. In this work, we present a framework for combining any blind SR prediction mechanism with any deep SR network, using a metadata insertion block to insert prediction vectors into SR network feature maps. Through comprehensive testing, we prove that state-of-the-art contrastive and iterative prediction schemes can be successfully combined with high-performance SR networks such as RCAN and HAN within our framework. We show that our hybrid models consistently achieve stronger SR performance than both their non-blind and blind counterparts. Furthermore, we demonstrate our framework's robustness by predicting degradations and super-resolving images from a complex pipeline of blurring, noise and compression.
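The metadata insertion idea above can be sketched in a few lines: a predicted degradation vector is projected to the feature-channel dimension and broadcast over the spatial axes of the SR network's feature maps. This is a minimal, hypothetical sketch; the names, the random projection, and the choice of additive fusion are illustrative assumptions, and the paper's block could equally concatenate or modulate channels.

```python
import numpy as np

def metadata_insertion(features, degradation_vec, rng=None):
    """Fuse a degradation-prediction vector into SR feature maps.

    Illustrative sketch: project the vector to the channel dimension,
    broadcast spatially, and add it to the feature maps.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    c, h, w = features.shape
    # Hypothetical "learned" projection: degradation dims -> feature channels.
    proj = rng.standard_normal((c, degradation_vec.shape[0])) * 0.01
    bias = proj @ degradation_vec              # shape (c,)
    return features + bias[:, None, None]      # broadcast over H x W

feats = np.ones((8, 4, 4))                     # toy feature maps (C, H, W)
deg = np.array([0.5, 1.2, 0.1])                # e.g. blur/noise/compression estimate
fused = metadata_insertion(feats, deg)
```

Because the fused vector is spatially constant, every spatial location of a channel receives the same degradation-dependent shift.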
Translated by Google Translate
Propagation Phase Contrast Synchrotron Microtomography (PPC-SR$\mu$CT) is the gold standard for non-invasive and non-destructive access to the internal structures of archaeological remains. In this analysis, the virtual specimen needs to be segmented to separate different parts or materials, a process that normally requires considerable human effort. In the Automated SEgmentation of Microtomography Imaging (ASEMI) project, we developed a tool to automatically segment these volumetric images, using manually segmented samples to tune and train a machine learning model. For a set of four ancient Egyptian animal mummy specimens, we achieved an overall accuracy of 94-98% when compared with manually segmented slices, approaching the results of off-the-shelf commercial software based on deep learning (97-99%) at much lower complexity. A qualitative analysis of the segmented output shows that our results are close to those from deep learning in terms of usability, justifying the use of these techniques.
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
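The survey's most common remedy for oversized samples, patch-based training (69% of respondents), can be sketched as simply tiling a large image into fixed-size crops that fit in memory. The patch and stride values below are illustrative, not taken from any surveyed entry.

```python
import numpy as np

def extract_patches(image, patch, stride):
    """Tile a 2D image into fixed-size patches (patch-based training sketch).

    Each patch can then be fed to the network independently, so the full
    image never has to be processed at once.
    """
    h, w = image.shape
    patches = []
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            patches.append(image[y:y + patch, x:x + patch])
    return np.stack(patches)

img = np.arange(64 * 64, dtype=float).reshape(64, 64)
batch = extract_patches(img, patch=32, stride=32)   # 2 x 2 grid -> 4 patches
```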
Because noise can interfere with downstream analysis, image denoising has come to occupy an important place in the image processing toolbox. The most accurate state-of-the-art denoisers typically train on a representative dataset. But gathering a training set is not always feasible, so interest has grown in blind zero-shot denoisers that train only on the image they are denoising. The most accurate blind zero-shot methods are blind-spot networks, which mask pixels and attempt to infer them from their surroundings. Other methods exist in which all neurons participate in forward inference; however, they are not as accurate and are susceptible to overfitting. Here we present a hybrid approach. We first introduce a semi blind-spot network where the network can see only a small percentage of inputs during the gradient update. We then resolve overfitting by introducing a validation scheme where we split pixels into two groups and fill in pixel gaps using domino tilings. Our method achieves an average PSNR increase of $0.28$ and a threefold increase in speed over the current gold-standard blind zero-shot denoiser Self2Self on synthetic Gaussian noise. We demonstrate the broader applicability of Pixel Domino Tiling by inserting it into a previously published method.
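The pixel-splitting step above can be illustrated with a simplified horizontal domino tiling: each 2x1 domino assigns one of its pixels to each group, and a group's missing pixel is filled in by its domino partner. This is a toy sketch of the idea, under assumed conventions; the paper's actual tilings and fill rule may differ, and the image width is assumed even.

```python
import numpy as np

def domino_split(image, rng=None):
    """Split pixels into two groups via a horizontal 2x1 domino tiling.

    Within each domino, one pixel goes to group A and the other to
    group B; the gap left in each group is filled with the value of the
    domino partner pixel.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    h, w = image.shape
    a, b = image.copy(), image.copy()
    for y in range(h):
        for x in range(0, w, 2):            # one 2x1 domino per step
            if rng.random() < 0.5:          # left pixel stays in group A
                a[y, x + 1] = image[y, x]   # A's gap filled by its partner
                b[y, x] = image[y, x + 1]   # B's gap filled by its partner
            else:                           # right pixel stays in group A
                a[y, x] = image[y, x + 1]
                b[y, x + 1] = image[y, x]
    return a, b

img = np.arange(16, dtype=float).reshape(4, 4)
group_a, group_b = domino_split(img)
```

Within every domino, the two groups jointly hold both original pixel values, so the pair of groups preserves all image information while each group alone never sees both halves of a domino.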
Federated learning is a collaborative method that aims to preserve data privacy while creating AI models. Current approaches to federated learning tend to rely heavily on secure aggregation protocols to preserve data privacy. However, to some degree, such protocols assume that the entity orchestrating the federated learning process (i.e., the server) is not fully malicious or dishonest. We investigate vulnerabilities to secure aggregation that could arise if the server is fully malicious and attempts to obtain access to private, potentially sensitive data. Furthermore, we provide a method to further defend against such a malicious server, and demonstrate effectiveness against known attacks that reconstruct data in a federated learning setting.
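The class of protocols under attack can be illustrated with the simplest pairwise-masking form of secure aggregation: each pair of clients shares a random mask that one adds and the other subtracts, so the server can recover only the sum of the updates, never an individual vector. This is a minimal sketch of the protocol family, not the paper's method; real protocols also handle key agreement and client dropout, which are omitted here.

```python
import numpy as np

def masked_updates(updates, seed=0):
    """Pairwise-masking secure aggregation sketch.

    For each client pair (i, j) with i < j, a shared random mask is
    added to client i's update and subtracted from client j's, so the
    masks cancel exactly in the aggregate.
    """
    n = len(updates)
    rng = np.random.default_rng(seed)
    masked = [u.astype(float).copy() for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.standard_normal(updates[0].shape)
            masked[i] += mask
            masked[j] -= mask
    return masked

raw = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
server_view = masked_updates(raw)               # what an honest server sees
aggregate = np.sum(server_view, axis=0)         # equals the true sum of updates
```

An honest-but-curious server learns only `aggregate`; the vulnerabilities studied above concern what a *fully malicious* server can additionally extract by deviating from such a protocol.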
Wind speed retrieval at the sea surface is of primary importance for scientific and operational applications. Besides weather models, in-situ measurements and remote sensing technologies, especially satellite sensors, provide complementary means to monitor wind speed. As sea surface winds produce sound that propagates underwater, underwater acoustic recordings can also deliver wind-related information. Whereas model-driven schemes, especially data assimilation approaches, are the state of the art for addressing inverse problems in geoscience, machine learning techniques have become increasingly appealing to fully exploit the potential of observation datasets. Here, we introduce a deep learning approach for the retrieval of wind speed time series from underwater acoustics, possibly complemented by other data sources such as weather model reanalyses. Our approach bridges data assimilation and learning-based frameworks to benefit from both prior physical knowledge and computational efficiency. Numerical experiments on real data demonstrate that we outperform state-of-the-art data-driven methods, with a relative gain of up to 16% in terms of RMSE. Interestingly, these results support the relevance of the time dynamics of underwater acoustic data for better informing the time evolution of wind speed. They also show that multimodal data, here underwater acoustic data combined with ECMWF reanalysis data, may further improve the reconstruction performance, including robustness with respect to missing underwater acoustic data.
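The bridge between data assimilation and learning described above can be illustrated with the simplest analysis step: correcting a model prior towards an observation with a gain factor. In the paper's framework, such a correction operator would be learned end-to-end; here the gain is a fixed scalar and all values are invented for illustration.

```python
import numpy as np

def assimilation_step(prior, obs, gain):
    """One Kalman-style analysis step: move the prior estimate towards
    the observation by a fraction `gain` of the innovation (obs - prior).
    In a learning-based scheme, `gain` would be produced by a trained
    network rather than fixed by hand."""
    return prior + gain * (obs - prior)

prior = np.array([5.0, 6.0, 7.0])   # e.g. weather-model reanalysis wind (m/s)
obs = np.array([6.0, 6.5, 9.0])     # wind speed inferred from acoustics
analysis = assimilation_step(prior, obs, gain=0.5)
```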
Vestibular schwannoma (VS) typically grows from the inner ear towards the brain. It can be separated into two regions, corresponding to being inside or outside the inner ear canal respectively. In this work, a VS segmentation approach that subdivides the segmentation into intra-/extra-meatal parts is presented, as the growth of the extrameatal region is a key factor in the disease management followed by clinicians. We annotated a dataset consisting of 227 T2 MRI instances, acquired longitudinally on 137 patients, excluding post-operative instances. We propose a staged approach, with the first stage performing whole-tumour segmentation and the second stage performing intra-/extra-meatal segmentation using the T2 MRI along with the mask obtained from the first stage. To improve the accuracy of the predicted meatal boundary, we introduce a task-specific loss which we call Boundary Distance Loss. Performance is evaluated against the direct intra-/extra-meatal segmentation task performance, i.e. the baseline. Our proposed method, with the two-stage approach and Boundary Distance Loss, achieved Dice scores of 0.8279±0.2050 and 0.7744±0.1352 for the extrameatal and intrameatal regions respectively, significantly improving on the baseline, which gave Dice scores of 0.7939±0.2325 and 0.7475±0.1346 for the extrameatal and intrameatal regions respectively.
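A boundary-distance style penalty can be sketched as the symmetrised mean distance between predicted and target boundary pixels. This toy, brute-force version is only an illustration of the concept: the paper's Boundary Distance Loss is defined for the meatal boundary specifically and is used inside network training, whereas the sketch below is a non-differentiable evaluation on binary masks.

```python
import numpy as np

def boundary_pixels(mask):
    """Coordinates of foreground pixels with a background 4-neighbour."""
    pad = np.pad(mask, 1)
    interior = pad[:-2, 1:-1] & pad[2:, 1:-1] & pad[1:-1, :-2] & pad[1:-1, 2:]
    return np.argwhere(mask & ~interior.astype(bool))

def boundary_distance_loss(pred, target):
    """Symmetrised mean nearest-boundary distance between two masks."""
    pb, tb = boundary_pixels(pred), boundary_pixels(target)
    if len(pb) == 0 or len(tb) == 0:
        return 0.0
    d = np.linalg.norm(pb[:, None, :] - tb[None, :, :], axis=-1)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

a = np.zeros((8, 8), dtype=int); a[2:6, 2:6] = 1   # target mask
b = np.zeros((8, 8), dtype=int); b[2:6, 2:6] = 1   # perfect prediction
c = np.zeros((8, 8), dtype=int); c[3:7, 2:6] = 1   # prediction shifted by 1
perfect = boundary_distance_loss(b, a)             # identical boundaries -> 0
shifted = boundary_distance_loss(c, a)             # misplaced boundary -> > 0
```

Unlike overlap measures such as Dice, this penalty grows with *how far* the boundary is misplaced, which is why a boundary-aware term helps sharpen the predicted meatal interface.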
Abdominal registration has a variety of applications, from drug research to anatomical modelling. Yet it remains a challenging application due to the morphological heterogeneity and variability of the human abdomen. Among the various registration methods proposed for this task, probabilistic displacement registration models estimate displacement distributions for a subset of points by comparing the feature vectors of points from the two images. These probabilistic models are informative and robust while allowing large displacements by design. Since the displacement distributions are typically estimated on a subset of points (which we refer to as driving points), due to computational requirements, we propose in this work to learn a driving points predictor. Compared with previously proposed methods, the driving points predictor is optimised in an end-to-end fashion to infer driving points tailored to a specific registration pipeline. We evaluate the impact of our contribution on two different datasets corresponding to different modalities. Specifically, we compare the performance of 6 different probabilistic displacement registration models when using either the driving points predictor or one of 2 other standard driving point selection methods. The proposed method improves performance in 11 out of 12 experiments.
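The driving-points idea can be sketched as scoring every pixel from its feature vector and keeping the top-k locations for the registration step. The linear scorer below is a stand-in assumption for the learned, end-to-end optimised predictor described above; all shapes and weights are illustrative.

```python
import numpy as np

def predict_driving_points(feature_map, weights, k):
    """Score each spatial location from its feature vector and return the
    coordinates of the k highest-scoring locations as driving points."""
    h, w, c = feature_map.shape
    scores = feature_map.reshape(-1, c) @ weights        # one score per pixel
    top = np.argsort(scores)[-k:]                        # indices of top-k scores
    return np.stack(np.unravel_index(top, (h, w)), axis=1)

rng = np.random.default_rng(0)
feats = rng.standard_normal((16, 16, 4))                 # toy feature map (H, W, C)
points = predict_driving_points(feats, weights=rng.standard_normal(4), k=32)
```

Only these k points then need displacement distributions, which is what keeps the probabilistic models computationally tractable.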
Recent self-supervised approaches have used large-scale image-text datasets to learn powerful representations that transfer to many tasks without finetuning. These methods often assume a one-to-one correspondence between images and their (short) captions. However, many tasks require reasoning about multiple images and long text narratives, such as describing a news article with a visual summary. Thus, we explore a novel setting where the goal is to learn a self-supervised visual-language representation that is robust to varying text length and the number of images. In addition, unlike prior work which assumed captions, we assume images only contain loose illustrative correspondence with the text. To explore this problem, we introduce a large-scale multimodal dataset containing 31M articles, 22M images and 1M videos. We show that state-of-the-art image-text alignment methods are not robust to longer narratives with multiple images. Finally, we introduce an intuitive baseline that outperforms these methods by over 10% on zero-shot image-set retrieval on the GoodNews dataset.
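One way to score an article against a *set* of images under loose correspondence is to let each image match only its single best sentence, then average these per-image maxima into an article-level score. This is a hypothetical sketch in the spirit of the intuitive baseline mentioned above, not the paper's exact formulation; all embeddings below are random placeholders.

```python
import numpy as np

def loose_alignment_score(sentence_embs, image_embs):
    """Article-to-image-set similarity under loose correspondence.

    Each image is scored by its best-matching sentence (max cosine
    similarity), and the article-level score averages these maxima, so
    no image is forced to describe the whole narrative."""
    s = sentence_embs / np.linalg.norm(sentence_embs, axis=1, keepdims=True)
    v = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    sim = v @ s.T                       # (num_images, num_sentences)
    return float(sim.max(axis=1).mean())

rng = np.random.default_rng(0)
sents = rng.standard_normal((12, 8))    # long article: 12 sentence embeddings
imgs = rng.standard_normal((3, 8))      # 3 loosely related images
score = loose_alignment_score(sents, imgs)
```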
Domain adaptation (DA) has recently raised strong interest in the medical imaging community. While a large variety of DA techniques have been proposed for image segmentation, most of these techniques have been validated on private datasets or on small publicly available datasets. Moreover, these datasets mostly addressed single-class problems. To tackle these limitations, the Cross-Modality Domain Adaptation (crossMoDA) challenge was organised in conjunction with the 24th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2021). CrossMoDA is the first large and multi-class benchmark for unsupervised cross-modality DA. The goal of the challenge is to segment two key brain structures involved in the follow-up and treatment planning of vestibular schwannoma (VS): the VS and the cochleas. Currently, the diagnosis and surveillance of VS patients are performed using contrast-enhanced T1 (ceT1) MRI. However, there is growing interest in using non-contrast sequences such as high-resolution T2 (hrT2) MRI. Therefore, we created an unsupervised cross-modality segmentation benchmark. The training set provides annotated ceT1 (N=105) and unpaired non-annotated hrT2 (N=105). The aim was to automatically perform unilateral VS and bilateral cochlea segmentation on the hrT2 images provided in the testing set (N=137). A total of 16 teams submitted their algorithms for the evaluation stage. The level of performance reached by the top-performing teams is strikingly high (best median Dice - VS: 88.4%; cochleas: 85.7%) and close to full supervision (median Dice - VS: 92.5%; cochleas: 87.7%). All top-performing methods made use of an image-to-image translation approach to transform the source-domain images into pseudo-target-domain images. A segmentation network was then trained using these generated images and the manual annotations provided for the source images.
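The winning pipeline can be sketched as two steps: translate each annotated ceT1 image into a pseudo-hrT2 appearance, then train a segmentation network on (pseudo-hrT2, ceT1 annotation) pairs. The translator below is a crude, hand-made intensity remapping standing in for the learned image-to-image translation models the teams actually used; all parameters are invented for illustration.

```python
import numpy as np

def fake_hrT2(cet1, shift=0.3, scale=0.7, noise=0.05, seed=0):
    """Toy stand-in for an image-to-image translation step: remap ceT1
    intensities towards a different contrast and add noise, producing a
    pseudo-target-domain image in [0, 1]."""
    rng = np.random.default_rng(seed)
    out = scale * cet1 + shift + noise * rng.standard_normal(cet1.shape)
    return np.clip(out, 0.0, 1.0)

cet1 = np.linspace(0, 1, 64).reshape(8, 8)      # toy annotated source image
annotation = (cet1 > 0.5).astype(int)           # labels defined on the source
pseudo_hrT2 = fake_hrT2(cet1)                   # training input in target style
# A segmentation network would now be trained on (pseudo_hrT2, annotation).
```

The key property is that the annotation is reused unchanged: only the image appearance crosses the domain gap, so no target-domain labels are ever needed.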